Large stock price changes: volume or liquidity? 
We analyze large stock price changes of more than five standard deviations for i) TAQ data for 
the year 1997 and ii) order book data from the Island ECN for the year 2002. We argue that large 
price changes are not due to large trading volumes. Instead, we find that extreme price fluctuations 
are mainly caused by a low density of limit orders stored in the order book, i.e. low liquidity. 



The discovery of power law distributions for commodity
and stock price changes, together with the
relevance of this discovery for the practical problem of 
risk management has spurred a large amount of interest 
in the price process in financial markets. For physicists, 
the power law distribution of price changes is very ap- 
pealing as the appearance of a power law is reminiscent of 
universality and critical phenomena, thus suggesting that 
there might be a basic and universal mechanism behind 
the distribution of price changes. Many phenomenolog- 
ical as well as microscopic models have been developed, 
which are able to explain the main stylized facts about 
financial time series.

In order to understand the mechanism underlying the 
empirically observed return distribution in detail, one 
needs to study the price impact of trades. Besides the 
influence of news breaking, stock prices change if there is 
an imbalance between supply and demand. If more peo- 
ple want to buy than to sell, stock prices will move up, 
if more people want to sell than to buy, they will move 
down. This relation is quantified by the price impact
function, which describes the stock price change as a
conditional expectation value given the order imbalance.
The order imbalance is
measured as the difference between the number of shares 
bought and the number of shares sold in a given time 
interval. 

Gabaix et al. have suggested that large price changes
are due to large order imbalances. Starting from the
distribution of trading volumes and a fit to the average
price impact function, they suggest an explanation for the
empirical power law distribution of stock price changes.
This approach has been criticized because the test
presented by Gabaix et al. lacks power in the presence of
correlations in the order flow and because the functional
form used to describe the price impact of large orders
seems to vary between stock markets. Instead, the authors
of that critique conclude from an event-based analysis
that large price changes are due to the granularity of the
order book, which gives rise to a time-varying liquidity.

Here, we present an empirical study of extreme stock 
price changes within time intervals of length $\Delta t = 5$ min.
We analyze one year of data for the 44 most frequently
traded NASDAQ stocks. These data are contained in the
Trades and Quotes (TAQ) data base published by the New York
Stock Exchange. In addition, we analyze one year of order
book data from the Island ECN for the ten most frequently
traded companies. For
both data bases, we find little evidence that price changes 
larger than five standard deviations are explained by the 
order imbalance. For the order book data, we are able to 
reconstruct the price impact function for time intervals 
with large price changes and find that price changes are 
quantitatively explained by unusually large slopes of the 
price impact function. 

The TAQ data base contains information about transactions,
such as the number of shares and the transaction price, as
well as information about quotes, i.e. the lowest sell
offer (ask price $S_{\rm ask}(t)$) and the highest buy offer
(bid price $S_{\rm bid}(t)$). The stock price change or return
in a time interval $\Delta t$ is defined as

$$ G(t) = \ln S_M(t + \Delta t) - \ln S_M(t) , \qquad (1) $$

where the midquote price $S_M(t) = \frac{1}{2}\left(S_{\rm bid}(t) + S_{\rm ask}(t)\right)$
is the arithmetic mean of bid and ask price. The order
imbalance $Q$ in a time interval is the sum of all signed
market orders executed between $t$ and $t + \Delta t$. For
the TAQ data, the sign of a transaction is determined 
by the Lee and Ready algorithm, which compares the 
transaction price to the midquote price. The sign is 
positive for buy orders (transaction price larger than 
midquote price) and negative for sell orders (transaction 
price smaller than midquote price). For the order book 
data, the data base contains information about the 
direction of a trade. We choose $\Delta t = 5$ min. Returns $G$
are normalized by their standard deviation $\sigma_G$, volumes
$Q$ by $\sigma_Q = \langle |Q - \langle Q \rangle| \rangle$, as their
cumulative distribution function follows a power law with
exponent close to two. The variance for data with such a
distribution is not well defined.
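The definitions above can be collected in a minimal Python sketch. It assumes the quotes and trades are already available as NumPy-compatible arrays; all function and argument names are hypothetical, and the trade sign uses only the midquote comparison described in the text (trades exactly at the midquote are left unclassified with sign 0).

```python
import numpy as np

def midquote(bid, ask):
    """Arithmetic mean of bid and ask price, S_M = (S_bid + S_ask) / 2."""
    return 0.5 * (np.asarray(bid, float) + np.asarray(ask, float))

def log_returns(mid_at_interval_edges):
    """Eq. (1): G(t) = ln S_M(t + dt) - ln S_M(t) for consecutive interval edges."""
    return np.diff(np.log(np.asarray(mid_at_interval_edges, float)))

def sign_trades(trade_price, mid_at_trade):
    """+1 for a buy (price above midquote), -1 for a sell (price below), 0 if equal."""
    return np.sign(np.asarray(trade_price, float) - np.asarray(mid_at_trade, float))

def order_imbalance(signs, shares, interval_index, n_intervals):
    """Sum of signed trade volumes in each time interval."""
    signed = np.asarray(signs, float) * np.asarray(shares, float)
    return np.bincount(np.asarray(interval_index), weights=signed,
                       minlength=n_intervals)

def normalize(G, Q):
    """Scale G by its standard deviation and Q by its mean absolute deviation."""
    G = np.asarray(G, float)
    Q = np.asarray(Q, float)
    sigma_G = np.std(G)
    sigma_Q = np.mean(np.abs(Q - np.mean(Q)))
    return G / sigma_G, Q / sigma_Q
```

The mean absolute deviation is used for $Q$ because, as stated above, the variance of a quantity with such a heavy-tailed distribution is not well defined.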

Average price impact and large events: The re- 
lation between price change and order imbalance is de- 
scribed by the price impact function 

$$ I_{\rm market}(Q) = \langle G_{\Delta t}(t) \rangle_Q . \qquad (2) $$

It describes the average price change G caused by an 
order imbalance $Q$ [21] in the same time interval.
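A conditional average of this kind is commonly estimated by binning the order imbalance. The sketch below is one such estimate, not necessarily the procedure used for the figures: the bin edges are a free choice, the function name is hypothetical, and the standard error of the mean is returned so that error bars like those in FIG. 1 can be drawn.

```python
import numpy as np

def average_price_impact(Q, G, bin_edges):
    """Mean return and its standard error in each order-imbalance bin."""
    Q = np.asarray(Q, float)
    G = np.asarray(G, float)
    # np.digitize returns i with bin_edges[i-1] <= q < bin_edges[i]
    idx = np.digitize(Q, bin_edges) - 1
    centers, mean_G, sem_G = [], [], []
    for b in range(len(bin_edges) - 1):
        g = G[idx == b]
        if g.size < 2:  # skip bins too sparse for an error estimate
            continue
        centers.append(0.5 * (bin_edges[b] + bin_edges[b + 1]))
        mean_G.append(g.mean())
        sem_G.append(g.std(ddof=1) / np.sqrt(g.size))
    return np.array(centers), np.array(mean_G), np.array(sem_G)
```

Values of $Q$ outside the outermost bin edges are ignored, so the edges should be chosen to cover the range of interest.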

We ask whether the average price impact function
$I_{\rm market}(Q)$ is able to describe extremely strong price
changes $G > 5\sigma_G$. We determined all time intervals
with price changes larger than five standard deviations
and checked carefully that these large price changes are
not due to errors in the data set but correspond to "real"
events. While the order book data seem to be free of
errors, some errors are contained in the TAQ data. We
have filtered the raw TAQ data against recording errors
and apparent price changes due to the combination
of data from different ECNs (electronic communications
networks). We have used the algorithm of Chordia et
al. [22], which discards all trades where the difference
between trade price and midquote price is larger than 4
times the spread. The spread is defined as
$S_{\rm ask} - S_{\rm bid}$. In addition, we have checked
visually the return and trading volume time series
surrounding the large price changes on a tick by tick basis
and have found no evidence for data errors after applying
the filtering algorithm. The data filtering removes about
one percent of all transactions and has a significant
effect on the exponent of the cumulative distribution
function $P(G > x) \sim x^{-\alpha}$. For the raw
data without any filtering, we find $\alpha = 2.1$; after
applying the filter we find $\alpha = 3.9$ by fitting a
straight line in a double logarithmic diagram. We note that
the filtering algorithm [22] is very restrictive in the
sense that it discards quite a few events where the TAQ
data set reports erratic and strong oscillations (of
several $\sigma_G$) of the price which are probably due to
the combination of data from different ECNs. While the
price has already reached its new "true" value in the
leading ECN, there may still be limit orders at the old
price in some smaller ECNs which are exploited by arbitrage
traders. While these oscillations are "true" price changes
in the sense that they are not due to recording errors,
they are an artifact of the trading system and were not
included in our analysis.

FIG. 1: (a) Average price impact function for the 44 most
frequently traded NASDAQ stocks in the year 1997 with
standard deviation of the mean. Price changes larger than
five standard deviations cluster in the region of small
volume imbalance; all of them are clearly outside the error
bars. (b) Same as (a) but for 2002 data from the Island ECN
order book for the ten most frequently traded stocks.

FIG. 2: Relation between the actual return $G$ and the
predicted return $G_{\rm pred}$ as calculated from the average
price impact function $I_{\rm market}(Q)$ and the actual order
volume.
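The spread-based filter and the straight-line fit in a double logarithmic diagram can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of the published procedure: the function names and the `max_spreads` and `x_min` parameters are hypothetical, the tail fit is applied to whatever positive magnitudes are passed in, and the empirical tail probability is taken as the fraction of observations above each sorted value.

```python
import numpy as np

def spread_filter(trade_price, bid, ask, max_spreads=4.0):
    """Boolean mask: True for trades within max_spreads spreads of the midquote."""
    trade_price, bid, ask = (np.asarray(a, float) for a in (trade_price, bid, ask))
    mid = 0.5 * (bid + ask)
    spread = ask - bid
    return np.abs(trade_price - mid) <= max_spreads * spread

def tail_exponent(g, x_min):
    """Exponent alpha of P(G > x) ~ x**(-alpha) from a log-log least-squares line."""
    x = np.sort(np.asarray(g, float))
    # fraction of observations above each sorted value (0 for the largest one)
    ccdf = 1.0 - np.arange(1, x.size + 1) / x.size
    keep = (x >= x_min) & (ccdf > 0)
    slope, _ = np.polyfit(np.log(x[keep]), np.log(ccdf[keep]), 1)
    return -slope
```

A least-squares line on a log-log plot mirrors the fit described in the text; it is sensitive to the choice of `x_min` and to the few points in the far tail.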

Figure 1 shows both the price impact function and
those events with price changes larger than 5 standard
deviations $\sigma_G$. We find 1198 such events for the TAQ
data base and 210 for the Island ECN data. The large
events cluster at quite small values of $Q$ where the price
impact function is significantly below $G = 5\sigma_G$.
Surprisingly, for some of these events not even the signs
of $Q$ and $G$ agree. We believe that this disagreement is
mainly caused by the inaccuracy of the Lee and Ready
algorithm, but the analysis of order book data reveals the
existence of such situations as well. We note that even for
large volume imbalances the average price impact function
is several standard deviations (measured by the statistical
error of the mean) below five $\sigma_G$ for the TAQ data.

In Figure 2 the actual returns $G(t)$ are plotted against
the predicted returns $G_{\rm pred}(t) = I_{\rm market}(Q(t))$ for
the order book data. A linear fit to the data points has a
slope of 2.58, indicating that the predicted returns are
considerably smaller than the actual ones. The linear
fit has a correlation coefficient $R^2 = 0.72$ and is not
visually convincing. We conclude that the main cause
for large returns is not a large imbalance between buy
and sell orders but some other effect.
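A comparison of this kind can be sketched in a few lines. The code assumes that a binned average price impact function is already available as arrays of bin centers and mean returns; the function names are hypothetical, and linear interpolation between bins is an assumption rather than the documented procedure.

```python
import numpy as np

def predicted_return(Q_event, q_centers, impact_mean):
    """G_pred = I_market(Q), read off the binned curve by linear interpolation.

    np.interp holds the end values constant outside the binned range.
    """
    return np.interp(Q_event, q_centers, impact_mean)

def fit_actual_vs_predicted(G_actual, G_pred):
    """Least-squares line G_actual = slope * G_pred + intercept, plus R^2."""
    slope, intercept = np.polyfit(G_pred, G_actual, 1)
    r = np.corrcoef(G_pred, G_actual)[0, 1]
    return slope, intercept, r ** 2
```

A slope well above one then means that the actual returns exceed what the average price impact function predicts for the observed order volume.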

Time varying price impact: As the average price
impact function does not provide a satisfactory
explanation of large returns, we study the time dependent
price impact. In a modern electronic market place, market
orders are matched with limit orders stored in the
order book. A buy limit order indicates that a trader is
willing to buy a specified number of shares at a given or
lower price, while a sell limit order signals that a trader
wants to sell a certain number of shares at a given or
higher price. The buy limit order with the highest price
determines the bid price, the sell limit order with the
lowest price the ask price. The price change due to a given
market order is determined by the limit orders stored
in the order book. If a trader places a buy market order
with volume $\Delta Q$, it executes as many limit orders
as necessary to fill that volume. In this way, the order
book determines the price change due to a single market
order. We describe the order book by a density function
$\rho_{\rm book}(\gamma, t)$, where the coordinate

$$ \gamma = \begin{cases} \ln(S_{\rm limit}) - \ln(S_{\rm bid}) & \text{limit buy order} \\ \ln(S_{\rm limit}) - \ln(S_{\rm ask}) & \text{limit sell order} \end{cases} , \qquad (3) $$

describes the position in the order book. In our analysis,
the order book density is defined on a discrete grid with
spacing $0.3\,\sigma_G$. A small market buy order with volume
$\Delta Q$ causes a return $\Delta G$. For such an order, volume and
return are related via $\Delta Q = \rho(0^+, t)\,\Delta G$. For a larger
order volume the relation is

$$ Q = \int_0^G \rho(\gamma, t)\, d\gamma . \qquad (4) $$

The return $G$ defined by Eq. (4) is denoted as the
instantaneous or virtual price impact. From this relation
one sees that the same order volume $Q$ can be related to
quite different returns $G$ depending on the value of
$\rho(\gamma, t)$. In the following, we will argue that it is
this time dependence of the order book which is responsible
for the occurrence of large price changes. We note that
from the order book one obtains only information about the
price change as a function of buy or sell volume. Order
book information can be related to time aggregated signed
order volumes only under the assumptions i) that the order
book is symmetric around the midquote price and ii) that
nonlinearities can be neglected. Both assumptions are
generally not satisfied. For this reason, we will consider
either the buy or the sell volume in a given five minute
interval, depending on the direction of the return in that
interval.

FIG. 3: Price change as a function of buy or sell volume for
ten of the largest price changes in the Island ECN data.
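On a discrete grid, the relation between order volume and virtual price impact can be inverted by accumulating the density cell by cell. The sketch below assumes a piecewise constant density on cells of equal width starting at the best quote; the function name and the example numbers are hypothetical and serve only to show that the same volume produces a larger return in a thinner book.

```python
import numpy as np

def virtual_price_impact(rho, d_gamma, Q):
    """Return G such that the integral of rho from 0 to G equals Q.

    rho[k] is the order density (volume per unit gamma) in grid cell k,
    which covers [k * d_gamma, (k + 1) * d_gamma).
    """
    rho = np.asarray(rho, float)
    cum = np.concatenate(([0.0], np.cumsum(rho * d_gamma)))  # volume up to each cell edge
    if Q > cum[-1]:
        raise ValueError("order volume exceeds the depth covered by the grid")
    if Q <= 0:
        return 0.0
    # first cell edge at which the accumulated volume reaches Q
    k = int(np.searchsorted(cum, Q, side="left"))
    # the order is completed inside cell k - 1, which is never empty here
    return float((k - 1) * d_gamma + (Q - cum[k - 1]) / rho[k - 1])

# Same volume, two hypothetical books with different depth
deep = virtual_price_impact([100.0, 100.0, 100.0], 0.3, 15.0)
thin = virtual_price_impact([25.0, 25.0, 25.0], 0.3, 15.0)
```

Using `searchsorted` rather than plain interpolation lets the order pass through empty cells, which is how gaps in the book translate into large returns.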



FIG. 4: Average price change as a function of buy or sell 
volume for all price changes larger than $5\sigma_G$ in the Island 
ECN data. The average price change for all transactions is 
much smaller than that for the extreme events. 

When studying the price impact of the order flow in 
a given time interval, it is not sufficient to invoke the 
order book density $\rho_{\rm book}(\gamma, t)$ at one instant of time. In
addition, one has to consider changes in the order book
which occur in this time interval. In [23] it was shown
that the virtual price impact of a given order volume is 
roughly four times stronger than the actual price impact. 
This difference is due to additional limit orders placed 
in reaction to a price change. From this example one 
sees that the inclusion of dynamical effects is crucial for 
calculating the correct price impact. 

In order to calculate the density of additional limit
orders arriving in a given time interval, we fix the
reference frame by the midquote price in the beginning of
the interval. The density of incoming limit orders is
denoted by $\rho_{\rm flow}(\gamma, t, \Delta t)$, and the total
order density is given by

$$ \rho(\gamma, t) = \rho_{\rm book}(\gamma, t) + \rho_{\rm flow}(\gamma, t, \Delta t) . \qquad (5) $$

From $\rho(\gamma, t)$ we calculate a price impact function
$I_{\rm actual}(Q)$ by inverting the relation Eq. (4). The sell
order side of this function for ten events with price
changes larger than $5\sigma_G$ is shown in Figure 3. In
Figure 4 the average over all such events is compared to
the average price impact function $I_{\rm market}(Q)$. One sees
that the slope of $I_{\rm actual}(Q)$ is much steeper than the
slope of $I_{\rm market}(Q)$.
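The construction of a price impact curve from the total order density can be illustrated as follows. The sketch assumes both densities are given on the same grid, measured from the midquote at the beginning of the interval; the function name and all numbers are hypothetical and chosen only to show how additional limit orders flatten the curve relative to the virtual impact.

```python
import numpy as np

def impact_curve(rho_book, rho_flow, d_gamma):
    """Points (Q_k, G_k) of the impact curve at the grid cell edges.

    The total density is rho_book + rho_flow as in Eq. (5); the cumulative
    volume up to each cell edge is the discrete form of Eq. (4).
    """
    rho = np.asarray(rho_book, float) + np.asarray(rho_flow, float)
    Q = np.concatenate(([0.0], np.cumsum(rho * d_gamma)))
    G = d_gamma * np.arange(rho.size + 1)
    return Q, G

# Hypothetical book with and without incoming limit orders
Q_virtual, G_virtual = impact_curve([10.0, 10.0], [0.0, 0.0], 0.3)
Q_actual, G_actual = impact_curve([10.0, 10.0], [30.0, 30.0], 0.3)

# Price impact of the same order volume read off both curves
g_virtual = np.interp(6.0, Q_virtual, G_virtual)
g_actual = np.interp(6.0, Q_actual, G_actual)
```

In this made-up example the incoming orders are three times as dense as the standing book, so the virtual impact exceeds the actual impact by a factor of four.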

The fact that the price impact function for large events 
has a steeper slope than the average price impact function
implies that in time intervals with large price movements
there are fewer limit orders available than on average.
Hence, the slope of the actual price impact function
provides a measurement of the market liquidity. 

Some curves displayed in Figure 3 show marked
nonlinearities. The exponents found from power law fits vary
between 0.15 and 2.35 with a mean of 1.32 and a standard
deviation of 0.41. However, the average of the $I_{\rm actual}$
for all large events (see Figure 4) is approximately
linear; a power law fit yields an exponent of 1.03. As a
measure for the strength of price impact, we define a
susceptibility $\chi(t)$ for the actual price impact function
for a given time interval by a linear fit up to a price
change of $G = 5\sigma_G$. Using this definition, we look for
an explanation of extreme price changes. We compare the
ratio of the actual price change $G(t)$ and the predicted
price change

$$ G_{\rm pred}(t) = I_{\rm market}(Q(t)) \qquad (6) $$

to the ratio of actual slope $\chi(t)$ and slope
$\chi_{\rm market}$ of the average price impact function
$I_{\rm market}$. As explained above, a price impact calculated
from the order book can only be defined for either buy or
sell volume. For this reason, we have recalculated
$I_{\rm market}$ by averaging with respect to either the sell
or the buy volume, depending on the sign of the price
change. This recalculated $I_{\rm market}$ is quite similar to
the original one. To calculate $G_{\rm pred}(t)$, we use the buy
volume for positive returns and the sell volume for
negative ones.

FIG. 5: Ratio of actual price change to predicted price change
plotted against the slope of the actual price impact function
normalized by the slope of the average price impact function.
The data points cluster in the vicinity of a linear fit.

In Figure 5 the ratio $G / G_{\rm pred}$ is plotted against
the normalized susceptibility $\chi / \chi_{\rm market}$ for all
events with $|G| > 5\sigma_G$. We see that the data points
cluster in the vicinity of a straight line fit with
$R^2 = 0.74$. We conclude that the
time dependent slope of the price impact function has 
a large explanatory power for the occurrence of extreme 
price changes. 
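The susceptibility and the final comparison can be sketched as below. The text specifies only "a linear fit up to a price change of $G = 5\sigma_G$"; fitting a line through the origin is an assumption made here because the impact vanishes at zero volume, and the function names are hypothetical.

```python
import numpy as np

def susceptibility(Q_curve, G_curve, g_max=5.0):
    """Slope dG/dQ of a least-squares line through the origin for G <= g_max.

    G is assumed to be in units of sigma_G, so g_max = 5 matches the text.
    The curve must contain at least one point with nonzero Q below g_max.
    """
    Q = np.asarray(Q_curve, float)
    G = np.asarray(G_curve, float)
    keep = G <= g_max
    return float(np.sum(Q[keep] * G[keep]) / np.sum(Q[keep] ** 2))

def ratio_fit(G, G_pred, chi, chi_market):
    """Fit G / G_pred against chi / chi_market; return slope, intercept, R^2."""
    x = np.asarray(chi, float) / chi_market
    y = np.asarray(G, float) / np.asarray(G_pred, float)
    slope, intercept = np.polyfit(x, y, 1)
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    return slope, intercept, r2
```

If liquidity alone explained the large events, the points would fall on a line with slope close to one and a small intercept.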

In summary, we have studied two alternative ap- 
proaches to explain large stock price changes: large fluc- 
tuations in trading volume and time changing liquidity 
for two different data sets. We find little evidence that 
extreme stock price changes are caused by large trading 
volume. For order book data, we are able to reconstruct 
the price impact for the time intervals with large returns. 
We find that the slope of this price impact strongly cor- 
relates with the ratio of observed return and the return 
predicted from the trading volume and the average price 
impact function. 